Unconstrained Speech Separation by Composition of Longest Segments
Authors
Abstract
A data-driven approach is presented for improving the performance of separating single-channel mixed speech signals, assuming unknown, arbitrary temporal dynamics. The new approach seeks and separates the longest mixed speech segments which can be accurately matched by composite training segments. Lengthening the mixed speech segments to match reduces the uncertainty of the matching constituent training segments, and hence the error of separation. Experiments are conducted on the Wall Street Journal database, for separating mixtures of large-vocabulary speech utterances. The results are evaluated using various objective and subjective measures, including the challenge of large-vocabulary continuous speech recognition. It is shown that the new separation approach leads to significant improvement in all these measures.
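The longest-segment idea can be illustrated with a toy sketch. This is not the paper's algorithm, only an illustration under stated assumptions: frames are log-spectral vectors, a mixture frame is approximated by the element-wise maximum of the two source frames (the log-max approximation), and the search is exhaustive. The function name `longest_composite_match` and the tolerance `tol` are hypothetical.

```python
import numpy as np


def longest_composite_match(mixture, train_a, train_b, start, tol=1e-6):
    """Find the longest mixture segment beginning at `start` that is matched
    by a composite of one segment from `train_a` and one from `train_b`.

    Assumption (not from the source): a composite frame is the element-wise
    maximum of the two training frames (log-max mixing approximation).

    Returns (length, offset_a, offset_b); length is 0 if nothing matches.
    """
    best = (0, -1, -1)
    for ia in range(len(train_a)):
        for ib in range(len(train_b)):
            length = 0
            # Extend the candidate segment frame by frame while it still fits.
            while (start + length < len(mixture)
                   and ia + length < len(train_a)
                   and ib + length < len(train_b)):
                composite = np.maximum(train_a[ia + length], train_b[ib + length])
                if np.max(np.abs(composite - mixture[start + length])) > tol:
                    break
                length += 1
            if length > best[0]:
                best = (length, ia, ib)
    return best
```

Once a match is found, the two matching training segments serve as the estimates of the separated sources for that stretch of the mixture. A longer match leaves fewer candidate training pairs consistent with the mixture, which is the reduction in uncertainty the abstract describes.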
Similar resources
Quantification of identical and unique segments in ethylene-propylene copolymers using two dimensional liquid chromatography with infra-red detection
Hyphenating High Temperature High Performance Liquid Chromatography (HT-HPLC) with High Temperature Size Exclusion Chromatography (HT-SEC), known as High Temperature Two Dimensional Liquid Chromatography (HT-HPLC x HT-SEC or HT 2D-LC), leads to an isocratic elution in the second dimension, which in turn enables the use of an IR detector (quantitative detection) for monitoring the eluting polymers. Experimental...
Comparison of Different Machine Learning Methods for Extractive Speech-to-Speech Summarization of Persian Without Using Transcripts
In this paper, extractive speech summarization using different machine learning algorithms was investigated. The task of speech summarization is to extract the important and salient segments from speech so that speech files can be accessed, searched, extracted and browsed more easily and at lower cost. In this paper, a new method for speech summarization without using automatic speech recognitio...
The Prosody of Discourse Structure and Content in the Production of Persian EFL Learners
The present research addressed the prosodic realization of global and local text structure and content in the spoken discourse data produced by Persian EFL learners. Two newspaper articles were analyzed using Rhetorical Structure Theory. Based on these analyses, the global structure in terms of hierarchical level, the local structure in terms of the relative importance of text segments and the ...
Blind Source Separation for Speech Application Under Real Acoustic Environment
A hands-free speech recognition system [1] is essential for the realization of an intuitive, unconstrained, and stress-free human-machine interface, where users can talk naturally because they require no microphone in their hands. In this system, however, since noise and reverberation always degrade speech quality, it is difficult to achieve high recognition performance, compared with the case ...
Problems in Blind Separation of Convolutive Speech Mixtures by Negentropy Maximization
This paper examines the suitability of marginal-statistics-based contrast functions, such as negentropy, for separating convolutive speech mixtures picked up by a linear microphone array. For this study we use our frequency-domain fixed-point ICA algorithm, based on negentropy maximization of the independent components. This algorithm is based on the heuristic assumption, in accordan...